Tracks#

Found in the sat_track table of the silver layer.


import lakh_midi_dataset
from ydata_profiling import ProfileReport
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import pandas as pd
import numpy as np
from functools import partial

# Configure Plotly for static output in Jupyter Book
pio.renderers.default = "png"
%%sql
ATTACH 'hf://datasets/nintorac/ntrc_lakh_midi/lakh_remote.duckdb' AS lakh_remote;

Feature Analysis#

First we show the data profile of the simple features stored in the track satellite.

%%sql -o track_df -t df
select 
    track_hk,
    audio_md5,
    analysis_sample_rate,
    danceability,
    duration,
    end_of_fade_in,
    energy,
    key_signature_id,
    key_confidence,
    loudness,
    mode_id,
    mode_confidence,
    start_of_fade_out,
    tempo,
    time_signature,
    time_signature_confidence,
    title,
    genre,
    year,
    analyzer_version,
    song_id,
    song_hotttnesss,
    idx_bars_confidence,
    idx_bars_start,
    idx_beats_confidence,
    idx_beats_start,
    idx_sections_confidence,
    idx_sections_start,
    idx_segments_confidence,
    idx_segments_loudness_max,
    idx_segments_loudness_max_time,
    idx_segments_loudness_start,
    idx_segments_pitches,
    idx_segments_start,
    idx_segments_timbre,
    idx_tatums_confidence,
    idx_tatums_start,
    -- skip heavy array columns for now
    --bars_start,
    --bars_confidence,
    --beats_start,
    --beats_confidence,
    --sections_start,
    --sections_confidence,
    --segments_start,
    --segments_confidence,
    --segments_loudness_max,
    --segments_loudness_max_time,
    --segments_loudness_start,
    --segments_pitches,
    --segments_timbre,
    --tatums_start,
    --tatums_confidence,
    load_date,
    record_source,
    partition_col
from lakh_remote.sat_track 
(Output preview omitted: 31034 rows × 40 columns.)

We display a data profile of the dataset; it contains various type-specific analyses and visualisations to help understand the shape of the data.

Tip

Check out the correlation between year and loudness to see how music has been getting louder over time.
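The correlation called out in the tip is easy to verify with pandas; a minimal sketch, using a small synthetic stand-in for `track_df` (real values will differ, and `year == 0` placeholder rows should be excluded):

```python
import pandas as pd

# Synthetic stand-in for track_df; the real frame comes from the SQL cell above.
df = pd.DataFrame({
    "year": [1970, 1980, 1990, 2000, 2010],
    "loudness": [-16.0, -14.5, -12.0, -9.5, -7.0],  # dB; louder over time
})

# Pearson correlation between release year and loudness, skipping missing years (0).
valid = df[df["year"] > 0]
corr = valid["year"].corr(valid["loudness"])
print(f"year/loudness correlation: {corr:.3f}")
```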


# run dataset profiling
profile = ProfileReport(track_df, progress_bar=False)
profile.to_notebook_iframe()

Time Series Analysis#

Now we’ll have a look at some of the Echo Nest time-series features. For that, we’ll pick a few songs at random.

Note

In the following plots, marker size encodes prediction confidence: the higher the confidence, the smaller the marker.
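A minimal sketch of that encoding (the `inv_confidence` column built in the plotting helpers below):

```python
# Invert confidence so low-confidence events render as large markers.
confidences = [1.0, 0.75, 0.2, 0.0]
inv_confidence = [1 - c for c in confidences]
print(inv_confidence)
```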

%%sql -o track_timeseries_df -t df
select 
    track_hk,
    title,
    year,
    bars_start,
    bars_confidence,
    beats_start,
    beats_confidence,
    sections_start,
    sections_confidence,
    segments_start,
    segments_confidence,
    segments_loudness_max,
    segments_loudness_max_time,
    segments_loudness_start,
    segments_pitches,
    segments_timbre,
    tatums_start,
    tatums_confidence
from lakh_remote.sat_track 
where track_hk in (
    '029f0cec6a749b64f45f27b8a7c56125',
    '02a93439e9627559dbc54f3a66f69c8c',
    '00d99d980159c5f67340a12debe54eae'
)
(Output: three tracks, Charleston Rag (1995), Three Little Birds (2007), and Knockin on Heaven's Door (2003), each with its bar, beat, section, segment, and tatum arrays.)

Bars Analysis#

Bars (also called measures) represent the primary metrical structure of music. A bar is a segment of time defined by a given number of beats.
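Since a bar spans a fixed number of beats, beats per bar can be recovered by counting beat onsets inside each bar interval. A minimal sketch on synthetic onset times (the real arrays are `bars_start` and `beats_start` from the query above):

```python
import numpy as np

# Synthetic onsets: a beat every 0.5 s, a bar every 2.0 s (i.e. 4/4 at 120 BPM).
beats_start = np.arange(0.0, 8.0, 0.5)
bars_start = np.arange(0.0, 8.0, 2.0)

# Assign each beat to the bar it falls in, then count beats per bar.
bar_idx = np.searchsorted(bars_start, beats_start, side="right") - 1
counts = np.bincount(bar_idx[bar_idx >= 0])
beats_per_bar = int(np.median(counts))
print(beats_per_bar)  # -> 4
```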


def create_combined_data(df, start_col, confidence_col, metric_name):
    """Create combined dataframe for faceted plotting"""
    combined_data = []
    
    for _, row in df.iterrows():
        if row[start_col] is not None and row[confidence_col] is not None:
            try:
                track_data = pd.DataFrame({
                    'time': row[start_col],
                    'confidence': row[confidence_col],
                    'track': f"{row['title'][:20]}..." if len(row['title']) > 20 else row['title'],
                    'year': row['year'],
                    'track_id': row['track_hk'][:8]
                })
                combined_data.append(track_data)
            except Exception as e:
                print(f"Warning: Skipping track {row['track_hk']} for {metric_name}: {e}")
                continue
    
    return pd.concat(combined_data, ignore_index=True) if combined_data else pd.DataFrame()

def create_delta_data(df, start_col, confidence_col, metric_name):
    """Create delta time dataframe for any rhythmic element"""
    combined_data = []
    
    for _, row in df.iterrows():
        if row[start_col] is not None and row[confidence_col] is not None:
            try:
                times = row[start_col]
                confidences = row[confidence_col]
                
                # Calculate time deltas between consecutive events
                if len(times) > 1:
                    time_deltas = [times[i] - times[i-1] for i in range(1, len(times))]
                    # Use confidence from the second event in each pair
                    delta_confidences = confidences[1:]
                    # Use the second event's time as x-axis
                    delta_times = times[1:]
                    
                    track_data = pd.DataFrame({
                        'time': delta_times,
                        'time_delta': time_deltas,
                        'confidence': delta_confidences,
                        'inv_confidence': [1 - c for c in delta_confidences],
                        'track': f"{row['title']}" if pd.notna(row['title']) else 'Unknown',
                        'year': row['year'],
                        'track_id': row['track_hk'][:8]
                    })
                    combined_data.append(track_data)
            except Exception as e:
                print(f"Warning: Skipping track {row['track_hk']} for {metric_name}: {e}")
                continue
    
    return pd.concat(combined_data, ignore_index=True) if combined_data else pd.DataFrame()

# Bars time deltas visualization
bars_data = create_delta_data(track_timeseries_df, 'bars_start', 'bars_confidence', 'bars')
if not bars_data.empty:
    fig = px.scatter(bars_data, x='time', y='time_delta',
                    color='track', size='inv_confidence',
                    title="Bars Time Deltas Across Tracks (first 60s)",
                    hover_data=['year', 'track_id', 'confidence'],
                    range_x=(0, 60),
                    labels={'time_delta': 'Time Delta (seconds)', 'time': 'Time (seconds)'})
    
    # Calculate Hz range for secondary axis
    min_delta = bars_data['time_delta'].min()
    max_delta = bars_data['time_delta'].max()
    
    if min_delta > 0 and max_delta > 0:
        min_hz = 1 / max_delta
        max_hz = 1 / min_delta
        
        fig.update_layout(
            height=400,
            yaxis=dict(title="Time Delta (seconds)"),
            yaxis2=dict(
                title="Frequency (Hz)",
                overlaying="y",
                side="right",
                range=[min_hz, max_hz],
                tickmode='linear',
                tick0=min_hz,
                dtick=(max_hz - min_hz) / 5
            ),
            legend=dict(
                orientation="h",
                x=0.5,
                y=-0.2,
                xanchor="center",
                yanchor="top",
                bgcolor="rgba(255,255,255,0.8)"
            )
        )
        
        import plotly.graph_objects as go
        fig.add_trace(
            go.Scatter(
                x=[bars_data['time'].min()],
                y=[min_hz], 
                yaxis='y2',
                mode='markers',
                marker=dict(opacity=0),
                showlegend=False,
                hoverinfo='skip'
            )
        )
    
    fig.show()
else:
    print("No valid bars data found")
![Bars time deltas across tracks](../../_images/1da297fa2ec81453bdcdb3006aa7ed48d6c7c009294e553257f5637f7289f2fc.png)

Beats Analysis#

Beats are the basic time units of music - the regular pulse that listeners tap their feet to. In the Echo Nest system, beats represent the perceived tactus or main pulse of the music.
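The gap between consecutive beats maps directly to tempo (BPM = 60 / delta), which is what the secondary axis in the chart below is based on; a minimal sketch on synthetic onsets:

```python
import numpy as np

# Synthetic beat onsets at a steady 0.5 s spacing.
beats_start = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

deltas = np.diff(beats_start)  # seconds between consecutive beats
bpm = 60.0 / deltas            # instantaneous tempo estimates
print(bpm)  # -> [120. 120. 120. 120.]
```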


# Beats time deltas visualization
beats_data = create_delta_data(track_timeseries_df, 'beats_start', 'beats_confidence', 'beats')
if not beats_data.empty:
    fig = px.scatter(beats_data, x='time', y='time_delta',
                    color='track', size='inv_confidence',
                    title="Beats Time Deltas Across Tracks (first 30s)",
                    hover_data=['year', 'track_id', 'confidence'],
                    range_x=(0, 30),
                    labels={'time_delta': 'Time Delta (seconds)', 'time': 'Time (seconds)'})
    
    # Calculate BPM range for secondary axis
    min_delta = beats_data['time_delta'].min()
    max_delta = beats_data['time_delta'].max()
    
    if min_delta > 0 and max_delta > 0:
        min_bpm = 60 / max_delta
        max_bpm = 60 / min_delta
        
        fig.update_layout(
            height=400,
            yaxis=dict(title="Time Delta (seconds)"),
            yaxis2=dict(
                title="BPM",
                overlaying="y",
                side="right",
                range=[min_bpm, max_bpm],
                tickmode='linear',
                tick0=min_bpm,
                dtick=(max_bpm - min_bpm) / 5
            ),
            legend=dict(
                orientation="h",
                x=0.5,
                y=-0.2,
                xanchor="center",
                yanchor="top",
                bgcolor="rgba(255,255,255,0.8)"
            )
        )
        
        import plotly.graph_objects as go
        fig.add_trace(
            go.Scatter(
                x=[beats_data['time'].min()],
                y=[min_bpm], 
                yaxis='y2',
                mode='markers',
                marker=dict(opacity=0),
                showlegend=False,
                hoverinfo='skip'
            )
        )
    
    fig.show()
else:
    print("No valid beats data found")
![Beats time deltas across tracks](../../_images/4300b2234698dd1a93450a823f3a6dd7e3e17ff39621ba3b407c5dd50031d922.png)

Sections Analysis#

Sections identify large-scale structural elements of songs such as verses, choruses, bridges, and instrumental solos. They are defined by significant changes in rhythm, timbre, or harmonic content.


# Sections time deltas visualization
sections_data = create_delta_data(track_timeseries_df, 'sections_start', 'sections_confidence', 'sections')
if not sections_data.empty:
    fig = px.scatter(sections_data, x='time', y='time_delta',
                    color='track', size='inv_confidence',
                    title="Sections Time Deltas Across Tracks",
                    hover_data=['year', 'track_id', 'confidence'],
                    labels={'time_delta': 'Time Delta (seconds)', 'time': 'Time (seconds)'})
    
    # Calculate Hz range for secondary axis
    min_delta = sections_data['time_delta'].min()
    max_delta = sections_data['time_delta'].max()
    
    if min_delta > 0 and max_delta > 0:
        min_hz = 1 / max_delta
        max_hz = 1 / min_delta
        
        fig.update_layout(
            height=400,
            yaxis=dict(title="Time Delta (seconds)"),
            yaxis2=dict(
                title="Frequency (Hz)",
                overlaying="y",
                side="right",
                range=[min_hz, max_hz],
                tickmode='linear',
                tick0=min_hz,
                dtick=(max_hz - min_hz) / 5
            ),
            legend=dict(
                orientation="h",
                x=0.5,
                y=-0.2,
                xanchor="center",
                yanchor="top",
                bgcolor="rgba(255,255,255,0.8)"
            )
        )
        
        import plotly.graph_objects as go
        fig.add_trace(
            go.Scatter(
                x=[sections_data['time'].min()],
                y=[min_hz], 
                yaxis='y2',
                mode='markers',
                marker=dict(opacity=0),
                showlegend=False,
                hoverinfo='skip'
            )
        )
    
    fig.show()
else:
    print("No valid sections data found")
![Sections time deltas across tracks](../../_images/c51c4e735276bc9ce3ce5cee833dd43488978456a8686d85dc06fb607714856e.png)

Segments Analysis#

Segments are short-duration sound entities (typically under a second) that are relatively uniform in timbre and harmony. They represent the most granular level of Echo Nest’s temporal analysis.


# Segments visualization
def create_segments_data(df):
    """Create segments dataframe with multiple metrics"""
    combined_data = []
    
    for _, row in df.iterrows():
        if (row['segments_start'] is not None and 
            row['segments_confidence'] is not None and 
            row['segments_loudness_max'] is not None):
            try:
                track_data = pd.DataFrame({
                    'time': row['segments_start'],
                    'confidence': row['segments_confidence'],
                    'inv_confidence': [1 - c for c in row['segments_confidence']],
                    'loudness_max': row['segments_loudness_max'],
                    'loudness_max_time': row['segments_loudness_max_time'] if row['segments_loudness_max_time'] is not None else [0] * len(row['segments_start']),
                    'loudness_start': row['segments_loudness_start'] if row['segments_loudness_start'] is not None else [0] * len(row['segments_start']),
                    'track': f"{row['title'][:20]}..." if len(row['title']) > 20 else row['title'],
                    'year': row['year'],
                    'track_id': row['track_hk'][:8]
                })
                combined_data.append(track_data)
            except Exception as e:
                print(f"Warning: Skipping track {row['track_hk']} for segments: {e}")
                continue
    
    return pd.concat(combined_data, ignore_index=True) if combined_data else pd.DataFrame()

segments_data = create_segments_data(track_timeseries_df)

# Loudness visualization
fig2 = px.scatter(segments_data, x='time', y='loudness_max',
                    color='track', size='inv_confidence',
                    title="Segments Loudness Analysis Across Tracks (first 4s)",
                    hover_data=['year', 'track_id'],
                    range_x=(0,4),
                    labels={
                        'loudness_max': 'Max Loudness (dB)',
                        'time': 'Time (s)',
                    }
                    )
fig2.update_layout(height=400)
fig2.show()
![Segments loudness across tracks](../../_images/d97db1e132ecd43810e9787613003497a9805e15d4c5afe34ef6a66e4c5e46cf.png)

Pitch Chroma Analysis#

Pitch chroma vectors are 12-dimensional representations of harmonic content, corresponding to the 12 pitch classes of Western music: C, C#, D, D#, E, F, F#, G, G#, A, A#, B.
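Each chroma vector can be collapsed to its dominant pitch class via argmax; a minimal sketch on a synthetic vector:

```python
import numpy as np

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

# Synthetic chroma vector with energy concentrated on G (index 7).
chroma = np.array([0.1, 0.05, 0.2, 0.1, 0.15, 0.3, 0.1, 1.0, 0.4, 0.2, 0.1, 0.05])
dominant = PITCH_CLASSES[int(np.argmax(chroma))]
print(dominant)  # -> G
```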


def create_pitch_subplots(df, n_segments):
    """Create pitch chroma heatmaps using subplots"""
    # Filter for rows with valid pitch data
    valid_rows = []
    for _, row in df.iterrows():
        if row['segments_pitches'] is not None and row['segments_start'] is not None:
            valid_rows.append(row)
    
    if not valid_rows:
        print("No valid pitch data found")
        return
    
    # Create subplots
    from plotly.subplots import make_subplots
    import plotly.graph_objects as go
    
    track_titles = [f"{row['title']}" for row in valid_rows]
    
    fig = make_subplots(
        rows=len(valid_rows), cols=1,
        subplot_titles=track_titles,
        shared_xaxes=False,
        vertical_spacing=0.15
    )
    
    for i, row in enumerate(valid_rows):
        try:
            pitches = np.stack(row['segments_pitches'])
            times = row['segments_start']
            
            fig.add_trace(
                go.Heatmap(
                    z=pitches.T[:, :n_segments],
                    x=times[:n_segments],
                    y=['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'],
                    colorscale='Viridis',
                    showscale=i==0,
                    colorbar=dict(title="Pitch Intensity") if i==0 else None
                ),
                row=i+1, col=1
            )
        except Exception as e:
            print(f"Warning: Skipping pitch visualization for track {row['track_hk']}: {e}")
    
    fig.update_layout(
        title=f"Pitch Chroma Analysis Across Tracks ({n_segments} segments)",
        height=200 * len(valid_rows),
        showlegend=False
    )
    
    # Update axes labels
    fig.update_yaxes(title_text="Pitch Class", row=1, col=1)
    fig.update_xaxes(title_text="Time (seconds)", row=len(valid_rows), col=1)
    
    fig.show()

create_pitch_subplots(track_timeseries_df, 30)
![Pitch chroma heatmaps, 30 segments](../../_images/6783db946b187504228f3ba7899a091432a88649c412332f846ffff0874df04e.png)


create_pitch_subplots(track_timeseries_df, 150)
![Pitch chroma heatmaps, 150 segments](../../_images/793bfee146bb4e6b3b9ec1b5fd1df7e26a43ef018b3a98f53cf7811af8c8b52c.png)


create_pitch_subplots(track_timeseries_df, 300)
![Pitch chroma heatmaps, 300 segments](../../_images/5b49ff519f5516b9bb70769b1ade503efdf93702498882b1094a5ff4e0ac5dc5.png)

Timbre Analysis#

Timbre vectors are 12-dimensional representations capturing the "tone color" or spectral characteristics of audio segments. These describe perceptual qualities like brightness, roughness, and spectral shape.
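As in the heatmap code below, the per-segment timbre vectors stack into an `(n_segments, 12)` matrix, from which per-dimension summaries follow; a minimal sketch on synthetic vectors:

```python
import numpy as np

# Synthetic per-segment timbre vectors: 3 segments x 12 dimensions.
segments_timbre = [np.arange(12.0), np.arange(12.0) + 1, np.arange(12.0) + 2]

timbres = np.stack(segments_timbre)  # shape (3, 12)
dim_means = timbres.mean(axis=0)     # average value per timbre dimension
print(timbres.shape, dim_means[0])   # -> (3, 12) 1.0
```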


# Timbre heatmaps using subplots
def create_timbre_subplots(df, n_segments=30):
    """Create timbre heatmaps using subplots"""
    # Filter for rows with valid timbre data
    valid_rows = []
    for _, row in df.iterrows():
        if row['segments_timbre'] is not None and row['segments_start'] is not None:
            valid_rows.append(row)
    
    if not valid_rows:
        print("No valid timbre data found")
        return
    
    # Create subplots
    from plotly.subplots import make_subplots
    import plotly.graph_objects as go
    
    track_titles = [f"{row['title']}" for row in valid_rows]
    
    fig = make_subplots(
        rows=len(valid_rows), cols=1,
        subplot_titles=track_titles,
        shared_xaxes=False,
        vertical_spacing=0.15
    )
    
    for i, row in enumerate(valid_rows):
        try:
            timbres = np.stack(row['segments_timbre'])
            times = row['segments_start']
            
            fig.add_trace(
                go.Heatmap(
                    z=timbres.T[:,:n_segments],
                    x=times[:n_segments],
                    y=[f'Timbre_{j+1}' for j in range(12)],
                    colorscale='Plasma',
                    showscale=i==0,
                    colorbar=dict(title="Timbre Intensity") if i==0 else None
                ),
                row=i+1, col=1
            )
        except Exception as e:
            print(f"Warning: Skipping timbre visualization for track {row['track_hk']}: {e}")
    
    fig.update_layout(
        title=f"Timbre Analysis Across Tracks (first {n_segments} segments)", 
        height=200 * len(valid_rows),
        showlegend=False
    )
    
    # Update axes labels
    fig.update_yaxes(title_text="Timbre Dimension", row=1, col=1)
    fig.update_xaxes(title_text="Time (seconds)", row=len(valid_rows), col=1)
    
    fig.show()

create_timbre_subplots(track_timeseries_df, 30)
![Timbre heatmaps, 30 segments](../../_images/a237b278fe30f68f142f75eb35cbe56d6a8ee7f7c52f6dd76dc99b56a7bbb420.png)


create_timbre_subplots(track_timeseries_df, 300)
![Timbre heatmaps, 300 segments](../../_images/a014954c5f4e199173517a93d93e10e95ca9ba9853b924e161a7804eb48f7b7b.png)


create_timbre_subplots(track_timeseries_df, 3000)
![Timbre heatmaps, 3000 segments](../../_images/ad033361c3a64e53b266e9e14a3e553de81a3f78b15af7e6481b054131944e27.png)

Tatums Analysis#

Tatums represent the smallest regular pulse that listeners intuitively infer from musical timing. Named after jazz pianist Art Tatum, "whose tatum was faster than all others", tatums capture micro-rhythmic subdivisions.
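Because tatums subdivide beats, the ratio of the median beat delta to the median tatum delta estimates the subdivision level; a minimal sketch on synthetic onsets:

```python
import numpy as np

# Synthetic onsets: beats every 0.5 s, tatums every 0.25 s (2 tatums per beat).
beats_start = np.arange(0.0, 4.0, 0.5)
tatums_start = np.arange(0.0, 4.0, 0.25)

ratio = np.median(np.diff(beats_start)) / np.median(np.diff(tatums_start))
tatums_per_beat = int(round(ratio))
print(tatums_per_beat)  # -> 2
```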


# Tatums visualization - time deltas between tatums
def create_tatums_delta_data(df):
    """Create tatums dataframe with time deltas between consecutive tatums"""
    combined_data = []
    
    for _, row in df.iterrows():
        if row['tatums_start'] is not None and row['tatums_confidence'] is not None:
            try:
                times = row['tatums_start']
                confidences = row['tatums_confidence']
                
                # Calculate time deltas between consecutive tatums
                if len(times) > 1:
                    time_deltas = [times[i] - times[i-1] for i in range(1, len(times))]
                    # Use confidence from the second tatum in each pair
                    delta_confidences = confidences[1:]
                    # Use the second tatum's time as x-axis
                    delta_times = times[1:]
                    
                    track_data = pd.DataFrame({
                        'time': delta_times,
                        'time_delta': time_deltas,
                        'confidence': delta_confidences,
                        'track': f"{row['title']}" if pd.notna(row['title']) else 'Unknown',
                        'year': row['year'],
                        'track_id': row['track_hk'][:8]
                    })
                    combined_data.append(track_data)
            except Exception as e:
                print(f"Warning: Skipping track {row['track_hk']} for tatums: {e}")
                continue
    
    return pd.concat(combined_data, ignore_index=True) if combined_data else pd.DataFrame()


# Tatums time deltas visualization
tatums_data = create_tatums_delta_data(track_timeseries_df)
fig2 = px.line(tatums_data, x='time', y='time_delta',
                    color='track', 
                    title="Tatums Time Deltas Across Tracks (first 100s)",
                    hover_data=['year', 'track_id', 'confidence'],
                    range_x=(0, 100),
                    labels={'time_delta': 'Time Delta (seconds)', 'time': 'Time (seconds)'})

# Calculate Hz range for secondary axis
min_delta = tatums_data['time_delta'].min()
max_delta = tatums_data['time_delta'].max()

if min_delta > 0 and max_delta > 0:
    # Convert to Hz: 1 / time_delta
    min_hz = 1 / max_delta
    max_hz = 1 / min_delta
    
    # Add secondary y-axis for Hz
    fig2.update_layout(
        height=400,
        yaxis=dict(title="Time Delta (seconds)"),
        yaxis2=dict(
            title="Frequency (Hz)",
            overlaying="y",
            side="right",
            range=[min_hz, max_hz],
            tickmode='linear',
            tick0=min_hz,
            dtick=(max_hz - min_hz) / 5
        ),
        # Position legend below plot to avoid covering data
        legend=dict(
            orientation="h",
            x=0.5,
            y=-0.2,
            xanchor="center",
            yanchor="top",
            bgcolor="rgba(255,255,255,0.8)"
        )
    )
    
    # Add invisible trace to force secondary axis to appear
    import plotly.graph_objects as go
    fig2.add_trace(
        go.Scatter(
            x=[tatums_data['time'].min()],
            y=[min_hz], 
            yaxis='y2',
            mode='markers',
            marker=dict(opacity=0),
            showlegend=False,
            hoverinfo='skip'
        )
    )

fig2.show()
![Tatums time deltas across tracks](../../_images/f0153eaef79562cdc05afd78ff370d52cdeefba13ee1f5a1b258851d9049b0cd.png)

Tatums Confidence#

Here we show the confidence of the tatum predictions over the first 100s.


# Original tatums confidence visualization
tatums_confidence_data = create_combined_data(track_timeseries_df, 'tatums_start', 'tatums_confidence', 'tatums')
fig1 = px.scatter(tatums_confidence_data, x='time', y='confidence',
                    color='track',
                    title="Tatums Confidence Across Tracks (first 100s)",
                    hover_data=['year', 'track_id'],
                    range_x=(0, 100))
fig1.update_layout(height=400)
fig1.show()
![Tatums confidence across tracks](../../_images/9934188fee175f53c61c3aa045d91302c295127b1a55a4006908688be51a8a75.png)

Summary#

This analysis provides a comprehensive view of the musical structure and acoustic features in the Lakh MIDI Dataset:

  • Rhythmic hierarchy: From tatums (smallest) to sections (largest)

  • Harmonic content: Pitch chroma vectors showing chord progressions

  • Timbral characteristics: Spectral features capturing sound texture

  • Temporal patterns: How musical elements evolve over time

The faceted visualizations enable comparison across different tracks, revealing both common patterns and unique characteristics in the dataset.